Local Statistics for Spatial Panel Models with Application
to the US Electorate
Jianfeng Wang, Adam B Kashlak
Mathematical & Statistical Sciences
University of Alberta
Edmonton, Canada, T6G 2G1
October 22, 2021
Abstract
The spatial panel regression model has shown great success in modelling econometric
and other types of data that are observed both spatially and temporally with associated
predictor variables. However, model checking via testing for spatial correlations in spatial-
temporal residuals is still lacking. We propose a general methodology for fast permutation
testing of local and global indicators of spatial association.
This methodology extends
past statistics for univariate spatial data that can be written as a gamma index for matrix
similarity to the multivariate and panel data settings. This includes Moran’s I and Geary’s
C among others. Spatial panel models are ﬁt and our methodology is tested on county-wise
electoral results for the five US presidential elections from 2000 to 2016 inclusive. County-wise
exogenous predictor variables included in this analysis are voter population density,
median income, and percentage of the population that is non-Hispanic white.
1 Introduction
Few datasets are more intriguing than those surrounding national elections. Trying to determine
which factors inﬂuence the whims of the electorate is of importance for both historical study
and future predictions. Political and demographic data naturally exist in a spatio-temporal
setting. One tool for modelling such data is the spatial panel regression model, which combines
panel data analysis, i.e. where measurements are taken on the same units over an extended
time period, with a spatial dependency network. Details and references are in Section 1.1.
Beyond mere modelling of political performance using spatial panel models, we are interested
in the model’s performance in ﬁtting to the data. To this end, we extend a variety of local
indicators of spatial association (LISA) to apply to the residuals produced by spatial panel
models. Furthermore, these test statistics are evaluated via a quick to compute analytic variant
of the classic permutation test with details and references in Section 1.2. This allows for fast exact
testing of regression residuals for local spatial association across a large graphical network. We
will test our methods on a data set of 5 years × 3104 counties worth of electoral and demographic
data.
The US elections and demographics data is introduced in Section 2. Some regressions and
plots are displayed to give a picture of the data under consideration. A variety of multivariate
LISA statistics are introduced in Section 3.1 and non-asymptotic exact signiﬁcance testing of
such statistics is discussed in Section 3.2. A simulation study on a 50 × 60 rectangular grid is
performed in Section 4 to compare four LISA statistics on multivariate Gaussian data. Fitted
spatial panel models on the US elections data are detailed in Section 5.1. LISA tests for the US
data are performed in Section 5.2, and a comparison of the performance of these methods on
this real world data is detailed in Section 5.3.
arXiv:2110.10622v1 [stat.ME] 20 Oct 2021
1.1 Spatial Panel Models
The spatial panel data regression model (Anselin et al., 2001; Baltagi et al., 2003; Kapoor et al.,
2007; Baltagi, 2008) extends panel data models into the spatial realm by accounting for both
random region eﬀects and spatially autocorrelated residuals. This model can be ﬁt to data via
the splm R package (Millo and Piras, 2012).
For yt being the observation vector from all regions at time t, we have
$$ y_t = \lambda W y_t + X_t \beta + u_t \tag{1.1} $$
$$ u_t = \rho W u_t + \varepsilon_t \tag{1.2} $$
$$ \varepsilon_t = \mu + v_t \tag{1.3} $$
from Kapoor et al. (2007). The model in Baltagi et al. (2003) is nearly identical but with a
few swapped terms. Equation 1.1 models the observations based on the n × p matrix of non-
stochastic predictors Xt at time t and unknown regression coeﬃcients β. Additionally, y can
have a spatial lag determined by estimating |λ| < 1 and the user selected weight matrix W.
The ut in Equation 1.2 is a spatially autocorrelated process depending on parameter |ρ| < 1.
Equation 1.3 models the innovations vector as µ, a regional random effect with variance $\sigma^2_\mu$,
and vt, iid mean zero normal errors with variance $\sigma^2_v$. Within the splm package, the function
spgm fits the above model using the generalized moments estimator from Kapoor et al. (2007). In
contrast, the function spml ﬁts a similar model using maximum likelihood as outlined in Baltagi
et al. (2003). Details on both methods can be found in Millo and Piras (2012). In this article,
we will ﬁt the model deﬁned by 1.1, 1.2, and 1.3 using spgm.
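To make the data-generating process of Equations 1.1 to 1.3 concrete, the following is a minimal simulation sketch in Python with NumPy. It is not the splm estimation routine; the function name `simulate_spatial_panel`, the ring-shaped weight matrix, and all parameter values are assumptions chosen purely for illustration. The sketch solves (1.2) for ut and (1.1) for yt, which requires I − ρW and I − λW to be invertible.

```python
import numpy as np

def simulate_spatial_panel(W, X, beta, lam, rho, sigma_mu, sigma_v, rng):
    """Draw (Y, U) from Equations 1.1-1.3.

    W: (n, n) weight matrix; X: (T, n, p) predictors; beta: (p,) coefficients.
    Returns Y and U, each (T, n), holding y_t and u_t in row t.
    """
    T, n, _ = X.shape
    eye = np.eye(n)
    mu = rng.normal(0.0, sigma_mu, size=n)  # regional effect, shared by all t
    Y = np.empty((T, n))
    U = np.empty((T, n))
    for t in range(T):
        eps = mu + rng.normal(0.0, sigma_v, size=n)                # (1.3)
        U[t] = np.linalg.solve(eye - rho * W, eps)                 # (1.2) solved for u_t
        Y[t] = np.linalg.solve(eye - lam * W, X[t] @ beta + U[t])  # (1.1) solved for y_t
    return Y, U

# Hypothetical toy setting: a ring of 6 regions with row-normalised weights.
n, T = 6, 5
ring = np.roll(np.eye(n), 1, axis=1) + np.roll(np.eye(n), -1, axis=1)
W = ring / ring.sum(axis=1, keepdims=True)
rng = np.random.default_rng(0)
X = rng.normal(size=(T, n, 2))
Y, U = simulate_spatial_panel(W, X, np.array([1.0, -0.5]), 0.3, 0.2, 0.15, 0.05, rng)
```

The regional effect µ is drawn once and reused for every t, which is what induces dependence within a region across time.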
1.2 Permutation Testing
There are two paradigmatic approaches to statistical hypothesis testing for spatial data models:
asymptotic normality and permutation testing. The former method is popular due to the rapidity
of producing a p-value, but relies on strong data assumptions and a large enough sample size
to be “asymptotic”.
Much past research (Anselin, 1995, 2019; Seya, 2020) suggests use of
a permutation test instead of the normal approximation. The permutation test (Mielke and
Berry, 2007; Pesarin and Salmaso, 2010; Brombin and Salmaso, 2013; Good, 2013) is an exact
non-parametric approach to statistical hypothesis testing where the data is permuted in order to
capture the behaviour of a test statistic under the null hypothesis. The biggest impediment to its
universal use is the computational burden of simulating massive numbers of permutations of one’s
dataset. To solve this problem, the recent work of Kashlak et al. (2020) proposes an analytic
approach to computing p-values from permutation tests for two-sample and k-sample testing
for complex data types like speech sounds. Such analytic permutation testing was extended to
LISA and GISA statistics for univariate spatial data in Kashlak and Yuan (2020). In this work,
the previously investigated permutation testing framework is extended to spatial-temporal data
speciﬁcally for the spatial panel model.
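For contrast with the analytic approach, the classic simulation-based permutation test can be sketched as follows. This is a generic two-sample example for a difference in means, not the spatial procedure of this article; the function name `permutation_pvalue` and its defaults are assumptions for illustration. The loop over `n_perm` shuffles is the computational burden referred to above.

```python
import random

def permutation_pvalue(x, y, n_perm=9999, seed=0):
    """Two-sided Monte Carlo permutation p-value for a difference in means."""
    rng = random.Random(seed)
    pooled = list(x) + list(y)
    n_x = len(x)

    def mean_diff(values):
        return sum(values[:n_x]) / n_x - sum(values[n_x:]) / (len(values) - n_x)

    observed = abs(mean_diff(pooled))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # relabel the pooled observations at random
        if abs(mean_diff(pooled)) >= observed:
            hits += 1
    # The +1 counts the observed arrangement itself, so the p-value is never zero.
    return (hits + 1) / (n_perm + 1)
```

Repeating such a loop at each of several thousand vertices is what makes a closed-form bound on the p-value attractive.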
2 US County-wise Elections Data
County-wise results for US presidential elections are available online via the MIT Election Data
and Science Lab (MIT Election Data and Science Lab, 2018). We will consider the electoral results over the five
presidential elections from 2000 to 2016 inclusive. For this analysis, the states of Alaska and
Hawaii were removed so that the counties considered form a connected graph. Also, three
island counties were removed similarly as they have no edges in the graph; these are Dukes and
Nantucket counties in Massachusetts and San Juan county in Washington. Lastly, Broomfield
county, Colorado, was removed from the dataset as it was incorporated in 2001 and thus was
not present for the election of 2000.

Table 1: Results of a post-hoc Tukey test comparing election years with bolded entries having
significant p-values with a test size of 5%. The values in the table are column-year minus
row-year.

Year      2004      2008      2012      2016
2000     3.29%    -0.10%     2.67%     6.35%
2004              -3.40%    -0.62%     3.06%
2008                         2.78%     6.46%
2012                                   3.68%
The observations considered are the vectors $y_i \in [0, 1]^5$ where $y_{ti}$ is the fraction of the vote
that went to the Republican candidate: George W. Bush (2000 and 2004), John McCain (2008), Mitt Romney (2012), and
Donald Trump (2016). The predictor variables considered are the voter population density (log-scale),
the state where the county resides (categorical), median income (log-scale) downloaded from
the Bureau of Labor Statistics, Local Area Unemployment Statistics, and the percentage of
the population that identiﬁes as non-Hispanic white according to the U.S. Census Bureau’s
Population Division. The relation between voting behaviour and each of the three continuous
predictor variables is complex. Hence, we consider linear, quadratic, and cubic polynomials for
each of these. Plots of these polynomials and the data are displayed in Figure 1. Comparing
log voter density to Republican vote, there is a noticeable negative trend indicating that the
most densely populated counties vote more heavily for the Democratic candidate. This coincides
with the common assumption that dense cities tend to vote left while sparse rural areas tend to
vote right. There is no strong positive or negative correlation between median income and voter
behaviour, but the cubic regression nevertheless identiﬁes a drop in the Republican vote for the
poorest counties. Lastly, a quadratic polynomial shows that the Republican vote begins to drop
on average as the non-Hispanic white percentage of the population drops below 60%.
For categorical predictors, boxplots of the ﬁve election years are displayed in Figure 2 and
Table 1 displays the results of a post-hoc Tukey test. The boxplots and table indicate that there
is no signiﬁcant diﬀerence in the average countywise Republican vote between years 2000 & 2008
and 2004 & 2012. The average countywise vote was higher in 2016 for Donald Trump than in
the previous years. However, Trump still received a smaller percentage of the popular vote than
Hillary Clinton, the Democratic challenger. Due to the subtleties of the electoral college system
used to elect a US president, a candidate’s county-wise performance does not completely dictate
the outcome of the election. Aggregated over the ﬁve years of elections, the median county-
wise Republican vote was 60.1% with 1st and 3rd quartiles of 50.5% and 69.6%. However, the
overall Republican popular vote—i.e. aggregated Republican votes over all ﬁve elections divided
by total votes cast—is only 47.4%. Boxplots are also plotted in Figure 3 for each US state
aggregated over all counties and the ﬁve elections. These are ordered from smallest to largest
median county Republican vote.
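The gap between the median county-wise vote and the overall popular vote arises because the former weights every county equally while the latter weights counties by votes cast. The following toy calculation illustrates the effect; the four counties and their vote counts are hypothetical numbers made up for the example and are not taken from the dataset.

```python
from statistics import median

# Hypothetical (votes_cast, republican_votes) for four counties; not real data.
counties = [(1_000, 700), (2_000, 1_300), (1_500, 900), (500_000, 200_000)]

shares = [rep / total for total, rep in counties]
median_share = median(shares)  # every county counts equally
popular_share = (sum(rep for _, rep in counties)
                 / sum(total for total, _ in counties))  # weighted by votes cast
```

Here three small counties lean Republican, so the median county share exceeds one half, while the single large county pulls the aggregated share below one half.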
[Figure 1 plot, titled "Republican Votes by Predictors": three scatter panels of Vote for Republican Candidate against log10( votes per sq kilometer ), log10( median income ), and Percentage Non-Hispanic White.]
Figure 1: The fraction of the vote for the Republican candidate plotted against log10 voter
population density (top left, R^2 = 0.196), log10 median income (top right, R^2 = 0.071), and
percentage of the population that is non-Hispanic white (bottom, R^2 = 0.143). The blue lines
are cubic polynomials fit by least squares for the top plots and a quadratic polynomial for the
bottom.
[Figure 2 plot: boxplots of Vote for Republican Candidate by year (2000, 2004, 2008, 2012, 2016).]
Figure 2: Boxplots of the vote percentages for each year in the dataset coloured by the winning
candidate (red = Republican, blue = Democrat).
[Figure 3 plot: boxplots of Vote for Republican Candidate by State, in the order DC, MA, VT, RI, CT, ME, NJ, NH, NY, CA, WI, DE, NM, MN, IA, WA, SC, MI, MD, AZ, IL, VA, CO, OR, NC, MS, FL, OH, PA, AR, WV, LA, SD, IN, AL, MO, TN, GA, ND, NV, KY, MT, OK, ID, TX, KS, NE, WY, UT.]
Figure 3: Boxplots of the vote percentages for each state in the dataset ordered from smallest
to largest median Republican vote.
3 Local Spatial Association
3.1 LISA Statistics
Given spatial measurements or residuals from a ﬁtted spatial model, it is desirable to identify
‘hot spots’ being any spatial region where the observations are correlated.
This led to the
development of many Local Indicators of Spatial Association (LISA) statistics (Cliff and Ord,
1981; Sokal et al., 1998; Waller and Gotway, 2004; Getis and Ord, 2010; Gaetan and Guyon,
2010; Luo et al., 2019; Seya, 2020). In what follows, we ignore normalizing constants as our
entire approach centers around permutation testing which is invariant to such constants.
For univariate data, we can define a few different LISA statistics. Let $G$ be a graph with $n$
vertices $\nu_1, \ldots, \nu_n$ and real valued measurements $y_1, \ldots, y_n \in \mathbb{R}$ at each vertex. For any choice
of $n \times n$ weight matrix $W$, Local Moran’s index for vertex $i$ is
$$ I_i \sim \sum_{j=1}^{n} w_{i,j} (y_i - \bar{y})(y_j - \bar{y}) $$
where $w_{i,j}$ is the $i,j$th entry of $W$. Local Geary’s statistic for vertex $i$ is defined as
$$ C_i \sim \sum_{j=1}^{n} w_{i,j} (y_i - y_j)^2. $$
The rank correlation statistic presented in Kashlak and Yuan (2020) and referred to as binary
association is
$$ B_i \sim \beta_i \sum_{j=1}^{n} w_{i,j} \beta_j, \qquad \beta_i = \begin{cases} 1, & y_i \ge m \\ -1, & y_i < m \end{cases} $$
for $m = \mathrm{median}(y)$. The classic Getis-Ord statistics are not included here as they are equivalent
to Moran’s index under the permutation methodology.
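As an illustration of the three univariate definitions, the following NumPy sketch evaluates the unnormalized statistics at every vertex at once. The function names are our own, and `y` is assumed to be a length-n array and `W` an n × n array of weights.

```python
import numpy as np

def local_moran(y, W):
    """I_i = sum_j w_ij (y_i - ybar)(y_j - ybar), without normalizing constants."""
    z = y - y.mean()
    return z * (W @ z)

def local_geary(y, W):
    """C_i = sum_j w_ij (y_i - y_j)^2."""
    diff2 = (y[:, None] - y[None, :]) ** 2
    return (W * diff2).sum(axis=1)

def local_binary(y, W):
    """B_i = beta_i sum_j w_ij beta_j with beta_i = +1 if y_i >= median(y), else -1."""
    b = np.where(y >= np.median(y), 1.0, -1.0)
    return b * (W @ b)
```

For example, on a path graph with four vertices and y = (1, 2, 4, 8), the Geary statistic at the second vertex is (2 − 1)^2 + (2 − 4)^2 = 5.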
For multivariate data, let $y_i \in \mathbb{R}^T$ be a $T$-long vector for the $i$th vertex. These univariate
LISA statistics can be extended to this setting as follows. For a fixed symmetric positive definite
$T \times T$ matrix $M$, an inner product on $\mathbb{R}^T$ can be defined as $\langle a, b \rangle_M = a^{\top} M b$ for any $a, b \in \mathbb{R}^T$.
Thus, for any such choice of $M$, the vector version of Moran’s statistic is
$$ I_i \sim \sum_{j=1}^{n} w_{i,j} \langle y_i - \bar{y}, y_j - \bar{y} \rangle_M $$
where we again ignore scaling constants. For $Y = (y_1\ y_2\ \ldots\ y_n) \in \mathbb{R}^{T \times n}$, this statistic can be
quickly computed at every vertex via the formula
$$ I = \mathrm{diag}\left[ (Y - \bar{Y})^{\top} M (Y - \bar{Y}) W \right] \in \mathbb{R}^n $$
where $\mathrm{diag}[A]$ extracts the diagonal of matrix $A$ and $\bar{Y} = (\bar{y}\ \bar{y}\ \ldots\ \bar{y})$. Geary’s statistic can be
extended to multivariate data by simply considering
$$ C_i \sim \sum_{j=1}^{n} w_{i,j} \| y_i - y_j \|_{\ell_p}^{p} $$
for some choice of $\ell_p$ norm. For the Euclidean norm, this can be computed for all vertices quickly
as $\| y_i - y_j \|_{\ell_2}^2 = \| y_i \|_{\ell_2}^2 + \| y_j \|_{\ell_2}^2 - 2 \langle y_i, y_j \rangle$. Noting that the matrix $Y^{\top} Y$ has $i,j$th entry $\langle y_i, y_j \rangle$
and setting $d = \mathrm{diag}(Y^{\top} Y)$,
$$ C = \mathrm{diag}\left\{ [(d \oplus d) - 2 Y^{\top} Y] W \right\} \in \mathbb{R}^n $$
where $(d \oplus d)_{i,j} = d_i + d_j$. Lastly, the binary association statistic can be simply generalized to
the multivariate setting in the same manner as with Moran’s index:
$$ B_i \sim \sum_{j=1}^{n} w_{i,j} \langle \beta_i, \beta_j \rangle, \qquad \beta_{i,k} = \begin{cases} 1, & y_{i,k} \ge m_k \\ -1, & y_{i,k} < m_k \end{cases} $$
for $m_k = \mathrm{median}(y_{i,k} : i = 1, \ldots, n)$. For binary association, we just consider the standard
Euclidean inner product, i.e. the dot product.

Table 2: Choices of matrices A and B to get a γi equivalent to one of the LISA statistics.

LISA      A_{i,j}    B_{i,j}
Moran     w_{i,j}    ⟨y_i − ȳ, y_j − ȳ⟩_M
Geary     w_{i,j}    ∥y_i − y_j∥^p_{ℓp}
Binary    w_{i,j}    ⟨β_i, β_j⟩
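The vectorized multivariate formulas can be sketched in NumPy as below. The function names are assumptions for illustration, `Y` is taken to be T × n with one column per vertex, and the weighted sums are computed as row sums of the entrywise product of W with the similarity matrix. This follows the summation definitions directly and coincides with the diag[·] expressions whenever W is symmetric.

```python
import numpy as np

def multivariate_moran(Y, W, M=None):
    """I_i = sum_j w_ij <y_i - ybar, y_j - ybar>_M for Y of shape (T, n)."""
    T = Y.shape[0]
    M = np.eye(T) if M is None else M
    Z = Y - Y.mean(axis=1, keepdims=True)
    B = Z.T @ M @ Z                 # B[i, j] = <y_i - ybar, y_j - ybar>_M
    return (W * B).sum(axis=1)

def multivariate_geary_l2(Y, W):
    """C_i = sum_j w_ij ||y_i - y_j||^2 via the Gram matrix Y^T Y."""
    G = Y.T @ Y
    d = np.diag(G)
    D2 = d[:, None] + d[None, :] - 2.0 * G   # (d (+) d) - 2 Y^T Y
    return (W * D2).sum(axis=1)

def multivariate_binary(Y, W):
    """B_i = sum_j w_ij <beta_i, beta_j> with coordinate-wise median splits."""
    beta = np.where(Y >= np.median(Y, axis=1, keepdims=True), 1.0, -1.0)
    return (W * (beta.T @ beta)).sum(axis=1)
```

The Gram-matrix form avoids an explicit double loop over vertex pairs, at the cost of holding an n × n dense matrix in memory.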
3.2 Significance Testing
3.2.1 Local Testing
A general approach to testing LISA statistics by analytically bounding the permutation test p-
value is introduced in Kashlak and Yuan (2020). Here, we extend this work to the multivariate
setting.
The gamma index (Mantel, 1967; Hubert, 1985) is a general measure of matrix similarity,
$\gamma_{AB} := \sum_{i,j=1}^{n} a_{i,j} b_{i,j}$, for two similar matrices $A$ and $B$. For specific choices of $A$ and $B$, the
gamma index can be treated as a general correlation statistic. The local gamma index (Anselin,
1995) is similarly defined as $\gamma_i = \sum_{j=1}^{n} a_{i,j} b_{i,j}$. The formulae from Section 3.1 can be rewritten
in terms of a local gamma index by choosing $A$ to be the weight matrix $W$ and $B$ to be one
of the entries in Table 2. To align with the notation of Anselin (1995) and Kashlak and Yuan
(2020), we will write $a_{i,j} = w_{i,j}$, $b_{i,j} = \lambda(y_i, y_j)$ for some similarity function $\lambda$, and finally that
$\gamma_i = \sum_{j=1}^{n} w_{i,j} \lambda(y_i, y_j)$. Using this notation, we can define the permuted local gamma index
to be $\gamma_i(\pi) = \sum_{j=1}^{n} w_{i,j} \lambda(y_i, y_{\pi(j)})$ where $\pi$ is a uniformly random element of $S_n$, the symmetric
group on $n$ elements, conditioned so that $\pi(i) = i$.
In Kashlak and Yuan (2020), it is proven that for any such gamma index constructed with
a binary weight matrix, i.e. $w_{i,j} \in \{0, 1\}$ for all $i, j = 1, \ldots, n$, the following concentration
inequality holds when $m_i = \sum_{j=1}^{n} w_{i,j} \ll n/2$ or $m_i \gg n/2$,
$$ P\left( |\gamma_i(\pi) - m_i \bar{\lambda}_{-i}| \ge \gamma_i \,\middle|\, y_1, \ldots, y_n \right) \le \frac{1}{\sqrt{\pi}} \Gamma\left( \frac{n-1}{m_i (n - m_i - 1)} \frac{\gamma_i^2}{2 s_i^2} ; \frac{1}{2} \right) + O(n^{-4}) \tag{3.1} $$
where $P(\cdot)$ is the uniform probability measure on the symmetric group conditioned so that
$\pi(i) = i$, $\bar{\lambda}_{-i} = (n-1)^{-1} \sum_{j=1}^{n} \lambda(y_i, y_j) 1_{i \ne j}$, $\Gamma(\cdot)$ is the upper incomplete gamma function, and
$s_i^2 = (n-1)^{-1} \sum_{j \ne i} (\lambda_{i,j} - \bar{\lambda}_{-i})^2$ is the sample variance of the $i$th row.
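Remark 3.1. Typically, for planar data such as that considered in this work, $m_i \ll n/2$ for all
vertices $i = 1, \ldots, n$. However, in the case where $m_i \sim n/2$, we define
$$ \varpi_- = \frac{\min\{m_i, n - m_i - 1\}}{\max\{m_i, n - m_i - 1\}^2}, \quad \text{and} \quad \varpi_+ = \frac{\max\{m_i, n - m_i - 1\}}{\min\{m_i, n - m_i - 1\}^2}. $$
Then, the concentration inequality becomes
$$ P\left( |\gamma_i(\pi) - m_i \bar{\lambda}_{-i}| \ge \gamma_i \,\middle|\, y_1, \ldots, y_n \right) \le C_0 I\left[ \exp\left( -\frac{\gamma_i^2}{2 s_i^2} \varpi_- \right) ; (n-1)\varpi_+, \frac{1}{2} \right] $$
where $I[\cdot]$ is the regularized incomplete beta function, and
$$ C_0 = \frac{\sqrt{(n-1)\varpi_+}\, \Gamma((n-1)\varpi_+)}{\Gamma(\frac{1}{2} + (n-1)\varpi_+)} $$
with $\Gamma(\cdot)$ the (complete) gamma function.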
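The leading term of the bound in Equation 3.1 is cheap to evaluate. Reading Γ(x; 1/2) as the upper incomplete gamma function with shape 1/2 evaluated at x, the standard identity Γ(1/2, x) = √π erfc(√x) reduces the bound to a complementary error function. The sketch below relies on that reading, drops the O(n^{-4}) remainder, and uses a hypothetical function name; `t` stands for whatever deviation is plugged in on the right-hand side of the inequality.

```python
import math

def lisa_pvalue_bound(t, m_i, n, s2_i):
    """Leading term of (3.1): erfc(sqrt(x)), x = (n-1)/(m_i (n-m_i-1)) * t^2 / (2 s_i^2).

    t: deviation threshold; m_i: number of neighbours of vertex i;
    n: number of vertices; s2_i: sample variance of the i-th row of similarities.
    """
    x = (n - 1) / (m_i * (n - m_i - 1)) * t ** 2 / (2.0 * s2_i)
    return math.erfc(math.sqrt(x))
```

No permutations are simulated: each vertex costs one evaluation of `math.erfc`, so all 3104 counties can be tested in negligible time.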
3.2.2 Global Testing
Statistics for local indicators of spatial association can be extended to statistics for global
indicators of spatial association (GISA). In Kashlak and Yuan (2020), a concentration inequality
similar to that for LISA statistics is proved. Given the same setup as the previous section, then
$$ P\left( \left| \gamma(\pi) - \sum_{i=1}^{n} m_i \bar{\lambda}_{-i} \right| \ge \gamma \,\middle|\, y_1, \ldots, y_n \right) \le \frac{1}{\sqrt{\pi}} \Gamma\left( \frac{\gamma^2}{4 \upsilon^2} ; \frac{1}{2} \right) + O(2^{-2n}) \tag{3.2} $$
where $\pi = (\pi_1, \ldots, \pi_n)$ with $\pi_i(i) = i$, $\gamma(\pi) = \sum_{i=1}^{n} \gamma_i(\pi_i)$ is the permuted variant of this test
statistic, and $\upsilon^2 = \sum_{i=1}^{n} \eta_i s_i^2$ with $\eta_i = m_i (n - m_i - 1)/(n - 1)$.
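Under the same reading of Γ(x; 1/2) as the upper incomplete gamma function, the leading term of Equation 3.2 equals erfc(|γ| / (2υ)). A minimal sketch follows, with a hypothetical function name, the exponentially small remainder dropped, and `t` standing for the deviation on the right-hand side of the inequality.

```python
import math

def gisa_pvalue_bound(t, m, s2):
    """Leading term of (3.2); m[i] and s2[i] are m_i and s_i^2 for each vertex."""
    n = len(m)
    # upsilon^2 = sum_i eta_i s_i^2 with eta_i = m_i (n - m_i - 1) / (n - 1)
    upsilon2 = sum(mi * (n - mi - 1) / (n - 1) * s2i for mi, s2i in zip(m, s2))
    return math.erfc(abs(t) / (2.0 * math.sqrt(upsilon2)))
```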
The GISA tests apply to the same statistics as the LISA tests do. These GISA test statistics
are more directly comparable with the LM and LR tests considered in Breusch and Pagan (1980);
Anselin and Bera (1998); Baltagi et al. (2003); Hsiao (2014) and others.
4 Simulation Studies
4.1 LISA Tests
Four LISA statistics—Moran, Geary ℓ2 and ℓ1, and Binary association—are tested on simulated
data on a (50×60)-vertex rectangular grid where up to four edges exist for each vertex connecting
it to those vertices above and below and to the left and to the right.
Denoting the graph
adjacency matrix as A, we generate 200 replicates of mean zero 5×3000 dimensional multivariate
Gaussian data with covariance I + cA where c ∈[−0.25, 0.25].
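One way to generate data of this kind is sketched below in NumPy. It assumes that each of the T = 5 coordinates is an independent draw of an n-vector with spatial covariance I + cA, which is our reading of the description above; the function names and the small grid in the usage note are illustrative choices. The Cholesky factor exists for |c| ≤ 0.25 because the eigenvalues of a finite grid adjacency matrix lie strictly between −4 and 4, so I + cA is positive definite.

```python
import numpy as np

def grid_adjacency(rows, cols):
    """Adjacency matrix of a rows x cols grid with up/down/left/right edges."""
    n = rows * cols
    A = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            v = r * cols + c
            if c + 1 < cols:                      # neighbour to the right
                A[v, v + 1] = A[v + 1, v] = 1.0
            if r + 1 < rows:                      # neighbour below
                A[v, v + cols] = A[v + cols, v] = 1.0
    return A

def simulate_grid_data(rows, cols, T, c, rng):
    """Return (Y, A) where Y is T x n and each row of Y is N(0, I + cA)."""
    A = grid_adjacency(rows, cols)
    L = np.linalg.cholesky(np.eye(rows * cols) + c * A)
    Z = rng.standard_normal((rows * cols, T))
    return (L @ Z).T, A
```

For instance, `simulate_grid_data(50, 60, 5, 0.1, np.random.default_rng(0))` would produce one 5 × 3000 replicate on the grid used in this section.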
Figure 4 charts the number of vertices with p-values less than 5% for each of the four LISA
statistics with p-values computed via formula 3.1. As c increases from zero, Moran’s statistic is
shown to identify the most signiﬁcant vertices followed by Geary with the ℓ1 norm. Geary ℓ2 and
binary association perform the worst. In contrast, for c decreasing invoking negative correlations
between adjacent vertices, the power curves for Moran and Geary ℓ2 show similar performance
with Geary ℓ1 identifying fewer signiﬁcant vertices and with binary association performing the
worst.
In these simulations, the binary association statistic performed the worst in both testing
scenarios.
However, in the next section, it is shown to have good performance on the US
elections data while Geary with the ℓ2 norm identiﬁed the fewest hot spots. Moran’s statistic
consistently achieves the highest power in all testing scenarios. However, the other three methods
often identify vertices missed by Moran’s statistic. This suggests that these LISA statistics can
be used to complement one another in an exploratory analysis.
4.2 GISA Tests
Testing for global spatial association (GISA) is also possible using formula 3.2.
Computing
p-values for the same simulated data and test statistics considered above results in Figure 5.
[Figure 4 plot: two panels, "Positive Correlation" and "Negative Correlation", of Percent Rejected against covariance for Moran, Geary l2, Binary, and Geary l1.]
Figure 4: Power curves for the four LISA tests as the covariance increases in the positive
direction (left) and negative direction (right).
For positively correlated neighbours, the global version of Moran’s test gives the best statistical
power. Geary’s ℓ2 and then ℓ1 statistics are next, and the global binary association test performs
the worst, but still is able to achieve a good amount of power to identify the presence of global
spatial association.
Similar behaviour is seen for a negatively correlated network.
Of note,
Geary’s ℓ2 global test gives slightly higher power than Geary’s ℓ1 whereas for local testing,
Geary with the ℓ1 norm has stronger performance than ℓ2 for positively correlated data.
In the real data sections to follow, we do not discuss GISA testing as all of the p-values are
extremely small for all four statistics and all ﬁve neighbourhoods considered. This is evident in
the LISA plots displayed below in Figure 7, which show many areas of spatial association under
each of the four statistics considered.
5 Real Data Results
5.1 Spatial Panel Data Regression
Fitting and testing a spatial panel model requires selection of a weight matrix. To coincide with
the theorems in Kashlak and Yuan (2020), we will only consider binary weight matrices—i.e.
those with entries 0 and 1 only. Among such weight matrices, we denote Wk to be the k-lagged
weight matrix. Beginning with W0 = I and W1 being the graph adjacency matrix, we recursively
deﬁne
$$ W_k = W_1^k \prod_{l=0}^{k-1} (\mathbf{1} - W_l) $$
where 1 is the n × n matrix of all ones. We will consider neighbourhoods for k = 1, . . . , 5.
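The recursion for the k-lagged weight matrices can be sketched as follows. We interpret the product with (1 − Wl) as entrywise and binarize the matrix power of W1, so that Wk has a one at (i, j) exactly when vertex j is k steps from vertex i; this interpretation and the function name `lagged_weights` are assumptions made for the example.

```python
import numpy as np

def lagged_weights(W1, K):
    """Return [W_0, ..., W_K], with W_k[i, j] = 1 iff j is exactly k steps from i."""
    n = W1.shape[0]
    W1 = (W1 > 0).astype(int)
    weights = [np.eye(n, dtype=int)]         # W_0 = I
    walks = np.eye(n, dtype=int)             # indicator of the power W_1^k
    unseen = 1 - weights[0]                  # entrywise product of (1 - W_l), l < k
    for _ in range(K):
        walks = ((walks @ W1) > 0).astype(int)   # binarized so entries stay 0/1
        Wk = walks * unseen                      # drop pairs already reached earlier
        weights.append(Wk)
        unseen = unseen * (1 - Wk)
    return weights
```

On a path graph with vertices 0-1-2-3, W2 connects the pairs (0, 2) and (1, 3), and W3 connects only (0, 3).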
The fitted spatial panel models are compared in Table 3. The spatial lag parameter λ decreases
steadily as the lag increases, while the spatial autocorrelated errors parameter ρ is largest in
magnitude at lag 1 and is close to zero for lags 3 to 5.
Thus, the spatial dependence in the data begins to wane as the neighbourhood spreads out.
[Figure 5 plot: two panels, "Positive Correlation" and "Negative Correlation", of Percent Rejected against covariance for Moran, Geary l2, Binary, and Geary l1.]
Figure 5: Power curves for the four GISA tests as the covariance increases in the positive
direction (left) and negative direction (right).

Table 3: A comparison of spatial panel models for neighbourhoods defined by lags 1,...,5.

Lag            1          2          3          4          5
λ           0.0225     0.0094     0.0035     0.0018     0.0013
ρ          -0.0779    -0.0180     0.0051     0.0086     0.0088
σ^2_v       0.0024     0.0023     0.0024     0.0023     0.0023
σ^2_µ       0.0242     0.0238     0.0238     0.0238     0.0239
R^2          59.4%      62.2%      62.9%      63.1%      63.1%
Thus, in the next section, we will focus on testing for local spatial association with the adjacency
weight matrix. The R2 values for the ﬁve ﬁtted models all are around 60%.
Fitting a spatial panel data regression model to this data captures the behaviour of voting in
most US counties. However, extreme counties with respect to the predictors result in inaccurate
or erroneous predictions. On the political left, New York County (Manhattan) and the Bronx both
have fitted values less than zero. New York County has the highest voter density with between 7000 and 8000
voters per square kilometre across the five elections. The next densest county is Kings County,
NY, with approximately 3500 voters per square kilometre. The Bronx is also one of the counties
with the densest population, but while New York County is about 47% white and Kings is 37.5% white, the
Bronx is less than 10% white. On the political right, both King and Loving counties, Texas, get
fitted values greater than 100%. This is mainly due to their being very sparsely populated (∼1 voter
per 20 square kilometres) and in Texas, which gives a boost to the expected Republican vote
as discussed next.
Inclusion of the state as a categorical predictor allows us to identify those states that vote
more or less favourably for the Republican candidate than the three continuous predictors—
population density, median income, and non-Hispanic whiteness—would suggest. Most notably,
the state of Texas has an estimated upward shift of 19.2% indicating that Texan counties on
average vote more heavily for the Republican candidate than expected given the other predictors.
[Figure 6 plot: three scatter panels of Vote for Republican Candidate against log10( votes per sq kilometer ), log10( median income ), and Percentage Non-Hispanic White.]
Figure 6: A copy of Figure 1 with the Texas counties coloured in red. This demonstrates a
higher percentage of Texan vote going towards the Republican candidate than expected by the
three predictors.
This is visualized in Figure 6, which shows the ﬁtted regression lines for the three predictors
with Texan counties coloured in red. The majority of Texan counties lie above the regression
lines.
5.2 Local Spatial Association
Four multivariate LISA statistics are applied to the residuals for the ﬁtted models. These are
Moran’s I, Geary’s statistic in both ℓ2 and ℓ1, and binary association. The p-values produced
by formula 3.1 are subsequently adjusted for multiple testing using the Benjamini-Hochberg
method for False Discovery Rate (FDR) control within the spatial framework making use of the
p.adjustSP function in the spdep R package (Bivand and Wong, 2018; Bivand et al., 2013). The
p-values reported were computed with W1, the adjacency matrix, as the chosen weight matrix.
Similar results were seen for other weight matrices.
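For reference, the standard Benjamini-Hochberg step-up adjustment can be sketched in a few lines. This is the textbook procedure applied to all p-values at once; it does not attempt to reproduce the neighbourhood-aware behaviour of p.adjustSP, and the function name is our own.

```python
def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

A county is then declared significant at an FDR of 5% when its adjusted p-value is at most 0.05.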
Counties with signiﬁcant p-values under each of the LISA statistics are displayed in Figure 7,
which are coloured by red or blue depending on whether the Republican or Democratic candidate
took more than 50% of the county’s vote in three or more of the years considered. This allows
for the visualization of hot spots where the residuals all trend in a similar direction. Moran’s
statistic ﬁnds the most signiﬁcant counties of the four methods; 18.5% of the 3104 counties are
designated as signiﬁcant at an FDR of 5%. Binary association comes next at 13.6%, Geary’s
statistic with the ℓ1 norm at 9.4% and lastly Geary’s statistic with ℓ2 with only 2.6% of the
counties deemed to have signiﬁcant spatial association after correcting for multiple testing. The
lighter colours indicate signiﬁcant at 5% FDR and the darker colours indicate signiﬁcant at 1%
FDR.
LISA statistics can detect both signiﬁcant positive and negative spatial association. However,
nearly all of the counties detected in Figure 7 are due to positive association. As these statistics
are computed on the residuals produced from the spatial panel model, positive spatial association
implies the existence of a geographic region where the residuals trend in the same direction.
These are geographic voting blocs whose votes trend in a similar direction even after the
three predictors and US state are taken into account. One of the most noticeable correlated
collections of counties is the Republican voting north-south region spanning from west Texas
upward through the Kansas-Colorado border. Another Republican voting bloc exists around
the border of Kentucky with Virginia and West Virginia extending into Tennessee. The Moran,
binary, and ℓ1 Geary maps highlight a Democrat voting region along the Mississippi river as it
separates the states of Iowa and Minnesota from Illinois and Wisconsin. The Moran map also
detects significant spatial autocorrelation along the left-voting west coast counties.

Figure 7: Maps displaying significant counties with respect to a given LISA statistic after
adjusting for multiple testing. The percentage of significant counties is displayed in the plot
titles.
5.3 Comparison of LISA Statistics
In the previous section, Figure 7 shows that the number of signiﬁcant counties detected can
vary a lot between the four methods. To further contrast the four methods, we can consider
2 × 2 tables for the agreements and disagreements between each pair of methods. In Table 4,
the Matthews Correlation Coefficient,
$$ \mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}, $$
and the Rand index,
$$ \mathrm{Rand} = \frac{TP + TN}{TP + FP + FN + TN}, $$
are computed for each pairing of methods. For simplicity of notation, we use true/false positive/negative
to refer to the methods agreeing or disagreeing on which counties are chosen to have
significant spatial association.
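Both agreement measures follow directly from the 2 × 2 table of two methods' decisions. In the sketch below, `sig_a` and `sig_b` are assumed to be equal-length sequences of booleans marking which counties each method flags as significant; the function name and the convention of returning an MCC of zero when the denominator vanishes are our own choices.

```python
import math

def agreement_scores(sig_a, sig_b):
    """Return (MCC, Rand index) for two equal-length sequences of booleans."""
    pairs = list(zip(sig_a, sig_b))
    tp = sum(a and b for a, b in pairs)              # both flag the county
    tn = sum((not a) and (not b) for a, b in pairs)  # neither flags it
    fp = sum(a and (not b) for a, b in pairs)        # only the first method
    fn = sum((not a) and b for a, b in pairs)        # only the second method
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom > 0 else 0.0
    rand = (tp + tn) / (tp + fp + fn + tn)
    return mcc, rand
```

Because the Rand index rewards joint non-detections, it stays high whenever most counties are non-significant under both methods, whereas the MCC does not.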
The Rand index or percentage of agreement is above 80% for each pairing of methods. This
is due to most of the counties being deemed non-significant (TN) by all methods. The MCC
gives a more nuanced comparison of the methods. Higher MCCs are seen for Moran and Binary
association, both of which are of the form of an inner product, and for Geary ℓ1 and ℓ2, both of
which are of the form of an ℓp norm.

Table 4: A comparison of the four LISA statistics using Matthews Correlation (left) and using
the Rand Index (right).

                        MCC                               Rand
            Geary ℓ2   Geary ℓ1   Binary      Geary ℓ2   Geary ℓ1   Binary
Moran         0.232      0.437     0.447        0.828      0.855     0.849
Geary ℓ2                 0.394     0.132                   0.860     0.922
Geary ℓ1                           0.412                             0.878
6 Discussion and Extensions
There are many ways to test for local and global spatial association for spatial panel data
models. In this work, we have seen that the extension of Moran’s statistic to the multivariate
domain gives the best power to identify local clusters of spatially dependent regions. However,
other methods have comparable statistical power and typically identify different significant
regions. Thus, an ensemble approach to detecting spatially dependent regions is warranted.
The extensions of LISA statistics presented in Section 3.1 are based on inner products (Moran
and binary association) and on ℓp norms (Geary). This naturally can be further applied to
spatially observed functional data such as climate data, linguistic data, and others (Delicado
et al., 2010; Menafoglio and Petris, 2016; Tavakoli et al., 2019). Panel data, longitudinal data,
functional data, and time series data are all closely related objects of study that often are
collected from a spatial domain. Our methodology is extendable to such areas of analysis.
Acknowledgements
The authors would like to thank the Natural Sciences and Engineering Research Council of
Canada (NSERC) for their funding support.
References
Luc Anselin. Local indicators of spatial association—lisa. Geographical analysis, 27(2):93–115,
1995.
Luc Anselin. A local indicator of multivariate spatial association: extending geary’s c. Geo-
graphical Analysis, 51(2):133–150, 2019.
Luc Anselin and Anil K Bera. Spatial dependence in linear regression models with an introduc-
tion to spatial econometrics. Statistics textbooks and monographs, 155:237–290, 1998.
Luc Anselin et al. Spatial econometrics. A companion to theoretical econometrics, 310–330, 2001.
Badi H Baltagi, Seuck Heun Song, and Won Koh. Testing panel data regression models with
spatial error correlation. Journal of econometrics, 117(1):123–150, 2003.
Badi Hani Baltagi. Econometric analysis of panel data, volume 4. Springer, 2008.
Roger Bivand and David W. S. Wong. Comparing implementations of global and local indica-
tors of spatial association. TEST, 27(3):716–748, 2018. URL https://doi.org/10.1007/
s11749-018-0599-x.
Roger S. Bivand, Edzer Pebesma, and Virgilio Gomez-Rubio. Applied spatial data analysis with
R, Second edition. Springer, NY, 2013. URL http://www.asdar-book.org/.
Trevor S Breusch and Adrian R Pagan. The lagrange multiplier test and its applications to
model speciﬁcation in econometrics. The review of economic studies, 47(1):239–253, 1980.
Chiara Brombin and Luigi Salmaso. Permutation tests in shape analysis, volume 15. Springer,
2013.
Andrew David Cliff and J Keith Ord. Spatial processes: models & applications. Taylor & Francis,
1981.
MIT Election Data and Science Lab. County Presidential Election Returns 2000-2020, 2018.
URL https://doi.org/10.7910/DVN/VOQCHQ.
Pedro Delicado, Ramón Giraldo, Carlos Comas, and Jorge Mateu. Statistics for spatial functional
data: some recent contributions. Environmetrics: The oﬃcial journal of the International
Environmetrics Society, 21(3-4):224–239, 2010.
Carlo Gaetan and Xavier Guyon. Spatial statistics and modeling, volume 90. Springer, 2010.
Arthur Getis and J Keith Ord. The analysis of spatial association by use of distance statistics.
In Perspectives on spatial data analysis, pages 127–145. Springer, 2010.
Phillip Good. Permutation tests: a practical guide to resampling methods for testing hypotheses.
Springer Science & Business Media, 2013.
Cheng Hsiao. Analysis of panel data. Number 54. Cambridge university press, 2014.
Lawrence J Hubert. Combinatorial data analysis: association and partial association.
Psychometrika, 50(4):449–467, 1985.
Mudit Kapoor, Harry H Kelejian, and Ingmar R Prucha. Panel data models with spatially
correlated error components. Journal of econometrics, 140(1):97–130, 2007.
Adam B Kashlak and Weicong Yuan. Computation-free nonparametric testing for local and
global spatial autocorrelation with application to the canadian electorate.
arXiv preprint
arXiv:2012.08647, 2020.
Adam B Kashlak, Sergii Myroshnychenko, and Susanna Spektor. Analytic permutation testing
via Kahane–Khintchine inequalities. arXiv preprint arXiv:2001.01130, 2020.
Qing Luo, Daniel A Griﬃth, and Huayi Wu. Spatial autocorrelation for massive spatial data:
veriﬁcation of eﬃciency and statistical power asymptotics. Journal of Geographical Systems,
21(2):237–269, 2019.
Nathan Mantel.
The detection of disease clustering and a generalized regression approach.
Cancer research, 27(2 Part 1):209–220, 1967.
Alessandra Menafoglio and Giovanni Petris. Kriging for hilbert-space valued random ﬁelds: The
operatorial point of view. Journal of Multivariate Analysis, 146:84–94, 2016.
Paul W Mielke and Kenneth J Berry.
Permutation methods: a distance function approach.
Springer Science & Business Media, 2007.
Giovanni Millo and Gianfranco Piras. splm: Spatial panel data models in R. Journal of Statistical
Software, 47(1):1–38, 2012. URL http://www.jstatsoft.org/v47/i01/.
Fortunato Pesarin and Luigi Salmaso. Permutation tests for complex data: theory, applications
and software. John Wiley & Sons, 2010.
Hajime Seya. Global and local indicators of spatial associations. In Spatial Analysis Using Big
Data, pages 33–56. Elsevier, 2020.
Robert R Sokal, Neal L Oden, and Barbara A Thomson. Local spatial autocorrelation in a
biological model. Geographical Analysis, 30(4):331–354, 1998.
Shahin Tavakoli, Davide Pigoli, John AD Aston, and John S Coleman.
A spatial modeling
approach for linguistic object data: Analyzing dialect sound variations across great britain.
Journal of the American Statistical Association, 114(527):1081–1096, 2019.
Lance A Waller and Carol A Gotway. Applied spatial statistics for public health data, volume
368. John Wiley & Sons, 2004.